1 Specification of the Umsdos file system for LINUX
2 What is UMSDOS
3 General strategy
4 File name mangling
5 --linux-.---: the EMD file
6 Hard links
7 Symbolic links
8 Special file
9 Pseudo root
10 Dual mode
11 Miscellaneous
11.1 UMSDOS_create
11.2 UMSDOS_ioctl_dir
11.3 UMSDOS_lookup
11.4 UMSDOS_notify_change
11.5 UMSDOS_readdir
11.6 mount and UMSDOS_remount_fs
11.7 UMSDOS_rename
11.8 Data structure
11.9 Inode management
12 Synchronisation problems
13 Convention and style
14 Weakness and features
15 utilities
15.1 The UMSDOS synchroniser
15.2 Other
16 Test cases
16.1 utstgen
16.2 utstspc
17 The MsDOS fs

1 Specification of the Umsdos file system for LINUX

This document describes the implementation of the UMSDOS file system for LINUX. It contains mostly implementation notes rather than a formal description of the concept. It is intended for a reader who already has a good knowledge of the LINUX VFS system. It is expected (hoped) that such a person may find weaknesses (or bugs) in UMSDOS just by reading this document. It is the first document to read if you suspect a bug.

2 What is UMSDOS

UMSDOS stands for "Unix in MSDOS" file system. UMSDOS is a full-featured Unix-like file system for LINUX. It operates within the limits (and semantics) of a normal MSDOS FAT file system. chkdsk won't complain at all. No dirty tricks. UMSDOS does NOT use MSDOS (the OS) to run, in case anyone wonders.

Using a special file in each directory ("--linux-.---"), UMSDOS simulates the full UNIX semantics (I hope :-) ):

-Long, case-sensitive, free-format file names like "This.is.a.sample"
-Permissions and owner (user and group)
-Links (hard and symbolic)
-Device special files and pipes

UMSDOS is powerful enough to act as a ROOT file system for LINUX. From a Linux kernel standpoint, this is a true filesystem.
Compile a kernel with UMSDOS, put it on a diskette, use rdev to specify the proper root partition (any msdos partition with linux in it), and boot.

In the rest of this document, the --linux-.--- file will be named EMD (Extension to Msdos Directory).

3 General strategy

UMSDOS operates on top of the MSDOS fs for LINUX. Using the VFS function table, UMSDOS mostly intercepts calls to the MSDOS fs, does some translation and sometimes carries out the operation itself. Most of the job is directory search, both in the MSDOS fs and in the EMD.

4 File name mangling

[linux/fs/umsdos/mangle.c,261]
#Specification: file name / non MSDOS conforming / mangling
A non MSDOS conforming file name must use some alias to fit in the MSDOS name space.

The strategy is simple. The name is simply truncated to 8 characters, periods are replaced with underscores and a number is given as an extension. This number corresponds to the entry number in the EMD file. The EMD file only needs to carry the real name.

Upper case is also converted to lower case. Control characters are converted to #. Spaces are converted to #. The following characters are also converted to #.

" * + , / : ; < = > ? [ \ ] | ~

Sometimes, the problem is not in MsDOS itself but in command.com.

[linux/fs/umsdos/mangle.c,26]
#Specification: file name / non MSDOS conforming / mangling
Each non MSDOS conforming file has a special extension built from the entry position in the EMD file.

This number is then transformed into a base 32 number, where each digit is expressed like a hexadecimal digit, using digits and letters, except that it uses the 22 letters from 'a' to 'v'. The number 32 comes from 2**5. It is faster to split a binary number using a base which is a power of two. And I was 32 when I started this project. Pick your answer :-) .

If the result is '0', it is replaced with '_', simply to make it odd. This is true for the two low-order digits. The high-order digit, which is written first in the extension, is taken from a list of odd characters, which are:

{ } ( ) !
` ^ & @

With this scheme, we can produce 9216 (9 * 32 * 32) different extensions which should not clash with any useful extension already popular or meaningful. Since most directories have far fewer than 32 * 32 files in them, the first character of the extension of any mangled name will be {.

Here are the reasons for this kind of mangling.

-The mangling is deterministic. Just from the extension, we are able to locate the entry in the EMD file.
-By keeping the beginning of the file name almost unchanged, we are helping the MSDOS user.
-The mangling produces names that are not too ugly, so an msdos user may live with them (remember them, type them, etc...).
-The mangling produces names ugly enough that no one will ever think of using such a name in real life.

This is not foolproof. I don't think there is a total solution to this.

[linux/fs/umsdos/mangle.c,306]
#Specification: file name / MSDOS devices / mangling
To avoid files that are unreachable from MsDOS, any MsDOS conforming file with a basename equal to one of the MsDOS pseudo devices will be mangled.

If a file such as "prn" was created, it would be unreachable under MsDOS because prn is assumed to be the printer, even if the file does have an extension.

Since the extension is unimportant to MsDOS, we must patch the basename also. We simply insert a minus '-'. To avoid conflict with a valid file with a minus in front (such as "-prn"), we add a mangled extension like any other mangled file name.

Here is the list of DOS pseudo devices:

"prn","con","aux","nul",
"lpt1","lpt2","lpt3","lpt4",
"com1","com2","com3","com4",
"clock$"

and some standard ones for common DOS programs

"emmxxxx0","xmsxxxx0","setverxx"

(Thanks to Chris Hall for pointing these out to me.)

Is there one missing ?

[linux/fs/umsdos/mangle.c,257]
#Specification: file name / --linux-.---
The name of the EMD file, --linux-.---, is mapped to a mangled name, so UMSDOS does not restrict its use.
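As a rough illustration of the base 32 extension scheme described in this section, here is a minimal sketch in C. The names demo_mangle_ext, demo_digits and demo_odd are hypothetical and are not taken from mangle.c. The digit order (odd character first, then the two ordinary digits) is an assumption drawn from the example name file.{_1 used in this section and from the remark that most extensions start with {.

```c
/* Hypothetical sketch of the 3-character mangled extension. */

/* The two low-order base 32 digits: '0' is replaced by '_',
 * then '1'..'9', then the 22 letters 'a'..'v'. */
static const char demo_digits[] = "_123456789abcdefghijklmnopqrstuv";

/* The nine "odd" characters used for the high-order digit. */
static const char demo_odd[] = "{}()!`^&@";

/* Build the extension for EMD entry number `entry` into ext[4].
 * Returns 0 on success, -1 if the entry number does not fit
 * in the 9 * 32 * 32 available extensions. */
int demo_mangle_ext(unsigned entry, char ext[4])
{
    if (entry >= 9u * 32u * 32u)
        return -1;
    ext[0] = demo_odd[entry / (32u * 32u)];   /* high-order digit */
    ext[1] = demo_digits[(entry / 32u) % 32u]; /* middle digit */
    ext[2] = demo_digits[entry % 32u];         /* low-order digit */
    ext[3] = '\0';
    return 0;
}
```

With this sketch, entry 1 gives {_1, entry 32 gives {1_ and entry 1024 gives }__, so the first character only changes once a directory holds more than 32 * 32 entries.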
[linux/fs/umsdos/mangle.c,147]
#Specification: file name / non MSDOS conforming / base length 0
File names beginning with a period '.' are invalid for MsDOS. It absolutely needs a base name. So the file name is mangled.

[linux/fs/umsdos/mangle.c,220]
#Specification: file name / non MSDOS conforming / mangling clash
To avoid clashes with the umsdos mangling, any file with a special character as the first character of the extension will be mangled. This solves the following problem:

    touch FILE
    # FILE is invalid for DOS, so mangling is applied
    # file.{_1 is created in the DOS directory
    touch file.{_1
    # Two UMSDOS files would point to a single DOS entry.
    # So file.{_1 has to be mangled.

[linux/fs/umsdos/mangle.c,212]
#Specification: file name / non MSDOS conforming / last char == .
If the last character of a file name is a period, mangling is applied. MsDOS does not support those file names.

[linux/fs/umsdos/mangle.c,139]
#Specification: file name / too long
If a file name exceeds the UMSDOS maxima, the file name is silently truncated. This makes it consistent with the other file systems of Linux (minix and ext2 at least).

5 --linux-.---: the EMD file

The strategy for inode management: UMSDOS lets the MSDOS fs run and does simple transformations to the in-core inode content. It also adds information at the end of the inode structure. See /usr/include/linux/umsdos_fs_i.h.

[/usr/include/linux/umsdos_fs.h,47]
#Specification: EMD file / record size
Entries are 64 bytes wide in the EMD file. This allows for a 30 character name. If a name is longer, contiguous entries are allocated. So a umsdos_dirent may span multiple records.

[linux/fs/umsdos/emd.c,282]
#Specification: EMD file structure
The EMD file uses a fairly simple layout. It is made of records (UMSDOS_REC_SIZE == 64). When a name can't be written in a single record, multiple contiguous records are allocated.
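The record arithmetic above can be sketched as follows. This is only an illustration: the name demo_records_for_name and the DEMO_ constants are hypothetical. It assumes that the fixed part of an entry takes the 34 bytes left over by a 30 character name in a 64 byte record, and that the extra contiguous records hold nothing but the rest of the name.

```c
/* Hypothetical sketch: how many 64-byte EMD records a name needs. */

#define DEMO_REC_SIZE        64   /* UMSDOS_REC_SIZE in the text */
#define DEMO_NAME_FIRST_REC  30   /* name bytes that fit in one record */
#define DEMO_HEADER_SIZE     (DEMO_REC_SIZE - DEMO_NAME_FIRST_REC)

/* Returns the number of contiguous records needed to store an entry
 * whose name is name_len bytes long, or 0 if there is no name. */
int demo_records_for_name(int name_len)
{
    if (name_len <= 0)
        return 0;
    /* Round (header + name) up to a whole number of records. */
    return (DEMO_HEADER_SIZE + name_len + DEMO_REC_SIZE - 1) / DEMO_REC_SIZE;
}
```

Under these assumptions a 30 character name fits in one record, a 31 character name needs two, and a third record is only needed from 95 characters on.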
[linux/fs/umsdos/emd.c,183]
#Specification: EMD file / empty entries
Unused entries in the EMD file are identified by the name_len field being equal to 0. However, to help future extensions (or bug corrections :-( ), empty entries are filled with 0.

[/usr/include/linux/umsdos_fs_i.h,80]
[/usr/include/linux/umsdos_fs_i.h,10]
#Specification: strategy / in memory inode
Here is the information specific to the inode of the UMSDOS file system. This information is added to the end of the standard struct inode. Each file system has its own extension to struct inode, and so does the umsdos file system.

The strategy is to have the umsdos_inode_info as a superset of the msdos_inode_info, since most of the time the job is done by the msdos fs code.

So we duplicate the msdos_inode_info, and add our own info at the end.

For all file types (and directories) the inode has a reference to:

-the directory which holds this entry: i_dir_owner
-the EMD file of i_dir_owner: i_emd_owner
-the offset in this EMD file of the entry: pos

For a directory, we also have a reference to the inode of its own EMD file. Also, we have dir_locking_info to help synchronise file creation and file lookup. This data shares space with the pipe_inode_info, which is not used by directories. See also msdos_fs_i.h for more information about pipe_inode_info and msdos_inode_info.

Special files and fifos do have an inode which corresponds to an empty MSDOS file.

Symlinks are processed mostly like regular files. The content is the link.

Fifos add their own extension to the inode. I have reserved some space for fifos side by side with msdos_inode_info. This is just for show, because msdos_inode_info already includes the pipe_inode_info.

The UMSDOS specific extension is placed after the union.

[linux/fs/umsdos/emd.c,194]
#Specification: EMD file / spare bytes
10 bytes are unused in each record of the EMD. They are set to 0 all the time, so it will be possible to do new stuff and rely on the state of those bytes in old EMD files.

6 Hard links

[linux/fs/umsdos/namei.c,469]
#Specification: hard link / strategy
Well ... hard links are difficult to implement on top of an MsDOS FAT file system. Unlike UNIX file systems, there are no inodes. A directory entry holds the functionality of both the inode and the entry.

We will use the same strategy as a normal Unix file system (with inodes), except we will do it symbolically (using paths).
Because anything can happen during a DOS session (defragmentation, directory sorting, etc...), we can't rely on the MsDOS pseudo inode number to record the link. For this reason, the link will be done using hidden symbolic links. The following scenario illustrates how it works.

Given a file /foo/file

    ln /foo/file /tmp/file2

becomes internally

    mv /foo/file /foo/-LINK1
    ln -s /foo/-LINK1 /foo/file
    ln -s /foo/-LINK1 /tmp/file2

Using this strategy, we can operate on /foo/file or /tmp/file2. We can remove one and keep the other, like a normal Unix hard link. We can rename /foo/file or /tmp/file2 independently.

The entry -LINK1 will be hidden. It will hold a link count. When all links are erased, the hidden file is erased too.

[linux/fs/umsdos/namei.c,541]
#Specification: hard link / directory
A hard link can't be made on a directory. EPERM is returned in this case.

[linux/fs/umsdos/emd.c,405]
#Specification: hard link / hidden name
When a hard link is created, the original file is renamed to a hidden name. The name is "..LINKNNN" where NNN is a number derived from the entry offset in the EMD file.

[linux/fs/umsdos/namei.c,560]
#Specification: hard link / first hard link
The first time a hard link is made to a file, this file must be renamed and hidden. Then an internal symbolic link must be made to the hidden file. The second link is then made to this hidden file.

It is expected that the Linux MSDOS file system keeps the same pseudo inode when a rename operation is done on a file in the same directory.

[linux/fs/umsdos/namei.c,912]
#Specification: hard link / deleting a link
When we delete a file, and this file is a link, we must subtract 1 from the nlink field of the hidden link. If the count goes to 0, we delete this hidden link too.

7 Symbolic links

[linux/fs/umsdos/namei.c,407]
#Specification: symbolic links / strategy
A symbolic link is simply a file which holds a path. It is implemented as a normal MSDOS file (not very space efficient :-( )

I see 2 other ways to do it.
One is to place the link data in unused entries of the EMD file. The other is to have a separate file dedicated to holding all symbolic link data.

Let's go for simplicity...

8 Special file

[linux/fs/umsdos/namei.c,729]
#Specification: Special files / strategy
Device special files, pipes, etc ... are created like normal files in the msdos file system. Of course they remain empty.

One strategy was to create those files only in the EMD file, since they were not important for MSDOS. The problem with that is that they were not getting an inode number allocated. The MSDOS filesystem is playing a nice game to fake inode numbers, so why not use it.

The absence of inode numbers compatible with those allocated for ordinary files was causing major trouble with hard links in particular, and with other parts of the kernel I guess.

9 Pseudo root

[linux/fs/umsdos/inode.c,416]
#Specification: pseudo root / mount
When a umsdos fs is mounted, special handling is done if it is the root partition. We check for the presence of the file /linux/etc/init or /linux/etc/rc. If one is there, we do a chroot("/linux").

We check both because (see init/main.c) the kernel tries to exec init at different places, and if that fails it tries /bin/sh /etc/rc. To be consistent with init/main.c, many more tests would have to be done to locate init. Any complaints ?

The chroot is done manually in init/main.c, but the info (the inode) is located at mount time and stored in a global variable (pseudo_root) which is used at different places in the umsdos driver. There is no need to store this variable elsewhere because there will always be one, not one per mount.

This feature allows the installation of a linux system within a DOS system in a subdirectory. A user may install their linux stuff in c:\linux, avoiding any clash with existing DOS files and subdirectories. When linux boots, it hides this fact, showing a normal root directory with /etc /bin /tmp ...
The word "linux" is hardcoded in /usr/include/linux/umsdos_fs.h in the macro UMSDOS_PSDROOT_NAME.

[linux/fs/umsdos/dir.c,69]
#Specification: pseudo root / directory /DOS
When umsdos operates in pseudo root mode (C:\linux is the linux root), it simulates a directory /DOS which points to the real root of the file system.

[linux/fs/umsdos/dir.c,486]
#Specification: pseudo root / DOS hard coded
The pseudo sub-directory DOS in the pseudo root is hard coded. The name is DOS. This is done this way to help standardise the umsdos layout. The idea is that from now on /DOS is a reserved path and nobody will think of using such a path for a package.

[linux/fs/umsdos/dir.c,517]
#Specification: pseudo root / .. in real root
Whenever a lookup is done in the real root for the directory .., and pseudo root is active, the pseudo root is returned.

[linux/fs/umsdos/namei.c,166]
#Specification: pseudo root / any file creation /DOS
The pseudo sub-directory /DOS can't be created! EEXIST is returned.

The pseudo sub-directory /DOS can't be removed! EPERM is returned.

[linux/fs/umsdos/dir.c,586]
#Specification: pseudo root / dir lookup
For the same reason as readdir, a lookup in /DOS for the pseudo root directory (linux) will fail.

[linux/fs/umsdos/rdir.c,81]
#Specification: pseudo root / DOS/..
In the real root directory (c:\), the directory .. is the pseudo root (c:\linux).

[linux/fs/umsdos/rdir.c,90]
#Specification: pseudo root / DOS/linux
Even in the real root directory (c:\), the directory /linux won't show.

[linux/fs/umsdos/dir.c,548]
#Specification: pseudo root / lookup(DOS)
A lookup of DOS in the pseudo root will always succeed and return the inode of the real root.

[linux/fs/umsdos/dir.c,169]
#Specification: pseudo root / reading real root
The pseudo root (/linux) is logically erased from the real root. This means that ls /DOS won't show "linux". This avoids infinite recursion (/DOS/linux/DOS/linux) while walking the file system.
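The pseudo root rules listed above can be summarised by a small user-space model. This is a sketch, not the driver code: every demo_ name is hypothetical, names are compared exactly, and -ENOENT for the hidden linux directory is an assumption, since the text only says that the lookup fails.

```c
#include <errno.h>
#include <string.h>

/* Which directory the operation is applied to (hypothetical model). */
enum demo_dir { DEMO_REAL_ROOT, DEMO_PSEUDO_ROOT };

/* Lookup outcomes; negative values are errno codes. */
enum {
    DEMO_ORDINARY = 0,       /* no special case, do the usual lookup */
    DEMO_IS_REAL_ROOT = 1,   /* resolves to the real root of the partition */
    DEMO_IS_PSEUDO_ROOT = 2  /* resolves to the pseudo root directory */
};

/* Special cases applied to a lookup when the pseudo root is active. */
int demo_pseudo_lookup(enum demo_dir dir, const char *name)
{
    if (dir == DEMO_PSEUDO_ROOT && strcmp(name, "DOS") == 0)
        return DEMO_IS_REAL_ROOT;      /* /DOS always succeeds */
    if (dir == DEMO_REAL_ROOT && strcmp(name, "..") == 0)
        return DEMO_IS_PSEUDO_ROOT;    /* /DOS/.. goes back to / */
    if (dir == DEMO_REAL_ROOT && strcmp(name, "linux") == 0)
        return -ENOENT;                /* assumed error code */
    return DEMO_ORDINARY;
}

/* Creating DOS in the pseudo root is refused. */
int demo_pseudo_mkdir(enum demo_dir dir, const char *name)
{
    if (dir == DEMO_PSEUDO_ROOT && strcmp(name, "DOS") == 0)
        return -EEXIST;
    return 0;
}

/* Removing DOS from the pseudo root is refused. */
int demo_pseudo_rmdir(enum demo_dir dir, const char *name)
{
    if (dir == DEMO_PSEUDO_ROOT && strcmp(name, "DOS") == 0)
        return -EPERM;
    return 0;
}
```

Hiding linux in the real root and redirecting .. are what keep a tree walk from looping through /DOS/linux/DOS/linux.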
[linux/fs/umsdos/rdir.c,129]
#Specification: pseudo root / rmdir /DOS
The pseudo sub-directory /DOS can't be removed! This is done even if the pseudo root is not a Umsdos directory anymore (very unlikely), but an accident (under MsDOS) is always possible.

EPERM is returned.

10 Dual mode

[linux/fs/umsdos/rdir.c,179]
#Specification: dual mode / introduction
One goal of UMSDOS is to allow a practical and simple coexistence between MsDOS and Linux in a single partition. Using the EMD file in each directory, UMSDOS adds Unix semantics and capabilities to a normal DOS file system. To help and simplify coexistence, here is the logic related to the EMD file.

-If it is missing, then the directory is managed by the MsDOS driver. The names are limited to DOS limits (8.3). No links, no device special files, no pipes and so on.
-If it is there, it is the directory. If it is there but empty, then the directory looks empty. The utility umssync allows synchronisation of the real DOS directory and the EMD.

Whenever umssync is applied to a directory without an EMD, one is created on the fly. The directory is promoted to full unix semantics. Of course, the ls command will show exactly the same content as before the umssync session.

It is believed that the user/admin will promote directories to unix semantics as needed.

The strategy to implement this is to use two function tables (struct inode_operations): one for true UMSDOS directories and one for directories with a missing EMD.

Functions related to the DOS semantics (but aware of UMSDOS) generally have a "r" prefix (r for real), such as UMSDOS_rlookup, to differentiate them from the ones with full UMSDOS semantics.

[linux/fs/umsdos/rdir.c,113]
#Specification: dual mode / rmdir in a DOS directory
In a DOS directory (no EMD in it), we use a reverse strategy compared with a Umsdos directory. We assume that a subdirectory of a DOS directory is also a DOS directory. This is not always true (umssync may be used anywhere), but it makes sense.
So we call msdos_rmdir() directly. If it fails with -ENOTEMPTY, then we check if it is a Umsdos directory. We check if it is really empty (only . .. and --linux-.--- in it). If so, we remove the EMD and do a msdos_rmdir() again.

In a Umsdos directory, we assume all subdirectories are also Umsdos directories, so we check the EMD file first.

[linux/fs/umsdos/namei.c,690]
#Specification: mkdir / umsdos directory / create EMD
When we create a new sub-directory in a UMSDOS directory (one with full UMSDOS semantics), we immediately create an EMD file in the new sub-directory so it inherits UMSDOS semantics.

11 Miscellaneous

11.1 UMSDOS_create

[linux/fs/umsdos/namei.c,55]
#Specification: file creation / not atomic
File creation is a two step process. First we create (allocate) an entry in the EMD file and then (using the entry offset) we build a unique name for MSDOS. We create this name in the msdos space.

We have to use a semaphore (sleep_on/wake_up) to prevent lookups into a directory while we create a file or directory, and to prevent creation while a lookup is going on. Since many lookups may happen at the same time, the semaphore is a counter.

Only one creation is allowed at a time. This protection may not be necessary. The problem arises mainly when a lookup or a readdir is done while a file is partially created. The lookup process sees that as a "normal" problem and silently erases the file from the EMD file. Normal because a file may be erased during an MSDOS session, but not removed from the EMD file.

The locking is done on a directory per directory basis. Each directory inode has its wait_queue.

For some operations like hard link, things get even worse. Many creations must occur at once (atomically). To simplify the design, a process is allowed to recursively lock the directory for creation. The pid of the locking process is kept along with a counter, so a second level of locking is granted or not.

[linux/fs/umsdos/namei.c,177]
#Specification: create / . and ..
If one tries to create . or .., it always fails and returns EEXIST.

If one tries to delete . or .., it always fails and returns EPERM.

This should be tested at the VFS layer level to avoid duplicating this in all file systems. Any comments ?

11.2 UMSDOS_ioctl_dir

[linux/fs/umsdos/ioctl.c,29]
#Specification: ioctl / acces
Only root (effective id) is allowed to do IOCTL on directories in UMSDOS. EPERM is returned for other users.

[linux/fs/umsdos/ioctl.c,37]
#Specification: ioctl / prototypes
The official prototype for the umsdos ioctl on a directory is:

    int ioctl (
        int fd,     // File handle of the directory
        int cmd,    // command
        struct umsdos_ioctl *data)

The struct and the commands are defined in linux/umsdos_fs.h.

umsdos_progs/umsdosio.c provides an interface in C++ to all these ioctls. umsdos_progs/udosctl is a small utility showing all this.

These ioctls generally allow one to work on the EMD or the DOS directory independently. They are essential to implement the synchroniser.

[linux/fs/umsdos/ioctl.c,58]
#Specification: ioctl / UMSDOS_GETVERSION
The version and release fields of the structure umsdos_ioctl are filled with the version and release number of the fs code in the kernel. This allows some form of checking. Users won't be able to run incompatible utilities such as the synchroniser (umssync). umsdos_progs/umsdosio.c enforces this checking.

Always returns 0.

[linux/fs/umsdos/ioctl.c,72]
#Specification: ioctl / UMSDOS_READDIR_DOS
One entry is read from the DOS directory at the current file position. The entry is put as is in the dos_dirent field of struct umsdos_ioctl.

Returns > 0 on success.

[linux/fs/umsdos/ioctl.c,193]
#Specification: ioctl / UMSDOS_RMDIR_DOS
The dos_dirent field of the struct umsdos_ioctl is used to execute a msdos_rmdir operation. The d_name and d_reclen fields are used.

Returns 0 on success.
[linux/fs/umsdos/ioctl.c,204]
#Specification: ioctl / UMSDOS_STAT_DOS
The dos_dirent field of the struct umsdos_ioctl is used to execute a stat operation in the DOS directory. The d_name and d_reclen fields are used.

The following fields of umsdos_ioctl.stat are filled:

st_ino, st_mode, st_size, st_atime, st_mtime, st_ctime

Returns 0 on success.

[linux/fs/umsdos/ioctl.c,182]
#Specification: ioctl / UMSDOS_UNLINK_DOS
The dos_dirent field of the struct umsdos_ioctl is used to execute a msdos_unlink operation. The d_name and d_reclen fields are used.

Returns 0 on success.

[linux/fs/umsdos/ioctl.c,147]
#Specification: ioctl / UMSDOS_CREAT_EMD
The umsdos_dirent field of the struct umsdos_ioctl is used as is to create a new entry in the EMD of the directory. The DOS directory is not modified. No validation is done (yet).

Returns 0 on success.

[linux/fs/umsdos/ioctl.c,81]
#Specification: ioctl / UMSDOS_READDIR_EMD
One entry is read from the EMD at the current file position. The entry is put as is in the umsdos_dirent field of struct umsdos_ioctl. The corresponding mangled DOS entry name is put in the dos_dirent field.

All entries are read, including hidden links. Blank entries are skipped.

Returns > 0 on success.

[linux/fs/umsdos/ioctl.c,164]
#Specification: ioctl / UMSDOS_UNLINK_EMD
The umsdos_dirent field of the struct umsdos_ioctl is used as is to remove an entry from the EMD of the directory. No validation is done (yet). The mode field is used to validate S_ISDIR or S_ISREG.

Returns 0 on success.

[linux/fs/umsdos/ioctl.c,228]
#Specification: ioctl / UMSDOS_DOS_SETUP
The UMSDOS_DOS_SETUP ioctl allows changing the default permissions of the MsDOS file system driver on the fly. The MsDOS driver applies global permissions to every file and directory. Normally these permissions are controlled by a mount option. This is not available for the root partition, so a special utility (umssetup) is provided to do this, normally in /etc/rc.local.
Be aware that this applies ONLY to MsDOS directories (those without the EMD --linux-.---). Umsdos directories have independent (standard) permissions for each and every file.

The umsdos_dirent field provides the information needed. umsdos_dirent.uid and gid set the owner and group. umsdos_dirent.mode sets the permission flags.

[linux/fs/umsdos/ioctl.c,124]
#Specification: ioctl / UMSDOS_INIT_EMD
The UMSDOS_INIT_EMD command makes sure the EMD exists for a directory. If it does not, it is created. Also, it makes sure the directory function table (struct inode_operations) is set to the UMSDOS semantics. This means that umssync may be applied to an "opened" msdos directory, and it will change behavior on the fly.

Returns 0 on success.

11.3 UMSDOS_lookup

[linux/fs/umsdos/dir.c,526]
#Specification: locating .. / strategy
We use the msdos filesystem to locate the parent directory. But it is more complicated than that.

We have to step back even further to get the parent of the parent, so we can get the EMD of the parent of the parent. Using the EMD file, we can locate all the info on the parent, such as permissions and owner.

[linux/fs/umsdos/dir.c,562]
#Specification: umsdos / lookup
A lookup for a file is done in two steps. First, we locate the file in the EMD file. If it is not present, we return an error code (-ENOENT). If it is there, we repeat the operation on the msdos file system. If this fails, it means that the file system is not in sync with the emd file. We silently remove this entry from the emd file, and return ENOENT.

[linux/fs/umsdos/dir.c,284]
#Specification: umsdos / lookup / inode info
After successfully reading an inode from the MSDOS filesystem, we use the EMD file to complete it. We update the following fields:

uid, gid, atime, ctime, mtime, mode.

We rely on MSDOS for mtime. If the file was modified during an MSDOS session, at least mtime will be meaningful. We do this only for regular files.
We don't rely on MSDOS for the mtime of directories because the MSDOS directory date is the creation time (strange MSDOS behavior), which fits nowhere in the three UNIX time stamps.

[linux/fs/umsdos/dir.c,309]
#Specification: umsdos / i_nlink
The nlink field of an inode is maintained by the MSDOS file system for directories and by UMSDOS for other files. The logic is that MSDOS is already figuring out what to do for directories and does nothing for other files. For MSDOS, there are no hard links, so all files carry nlink==1. UMSDOS uses some info in the EMD file to plug in the correct value.

11.4 UMSDOS_notify_change

[linux/fs/umsdos/inode.c,351]
#Specification: notify_change / msdos fs
notify_change operations are done only on the EMD file. The msdos fs is not even called.

[linux/fs/umsdos/inode.c,295]
#Specification: root inode / attributes
I don't know yet how this should work. Normally the attributes (permission bits, owner, times) of a directory are stored in the EMD file of its parent.

One thing we could do is store the attributes of the root inode in its own EMD file. A simple entry named "." could be used for this special case. It would be read once when the file system is mounted and updated in UMSDOS_notify_change() (right here).

I am not sure of the behavior of the root inode for a real UNIX file system. For now, this is a nop.

[linux/fs/umsdos/inode.c,288]
#Specification: notify_change / i_nlink > 0
notify_change is only done for inodes with nlink > 0. An inode with nlink == 0 is no longer associated with any entry in the EMD file, so there is nothing to update.

11.5 UMSDOS_readdir

[linux/fs/umsdos/dir.c,132]
#Specification: umsdos / readdir
umsdos_readdir() should fill a struct dirent with an inode number. The cheap way to get it is to do a lookup in the MSDOS directory for each entry processed by the readdir() function. This is not very efficient, but very simple. The other way around is to maintain a copy of the inode number in the EMD file.
This is a problem because this has to be maintained in sync using tricks. Remember that MSDOS (the OS) does not update the modification time (mtime) of a directory. There is no easy way to tell that a directory was modified during a DOS session and synchronise the EMD file. Suggestions welcome.

So the easy way is used!

[linux/fs/umsdos/dir.c,83]
#Specification: readdir / . and ..
The msdos filesystem manages the . and .. entries properly, so the EMD file won't hold any info about them.

In readdir, we assume that for the root directory the read position will be 0 for "." and 1 for "..". For a non root directory, the read position will be 0 for "." and 32 for "..".

[linux/fs/umsdos/dir.c,204]
#Specification: umsdos / readdir / not in MSDOS
During a readdir operation, if the file is not in the MSDOS directory anymore, the entry is silently removed from the EMD file.

11.6 mount and UMSDOS_remount_fs

[linux/fs/umsdos/inode.c,394]
#Specification: mount / options
Umsdos runs on top of msdos. Currently, it supports no mount options of its own, but happily passes all options received to the msdos driver. I am not sure if all msdos mount options make sense with Umsdos. Here are at least those which are useful.

uid=
gid=

These options affect the operation of umsdos in directories which do not have an EMD file. Those directories behave like normal msdos directories, with all the limitations of msdos.

11.7 UMSDOS_rename

[linux/fs/umsdos/namei.c,1000]
#Specification: rename / new name exist
If the destination name already exists, it will silently be removed. EXT2 does it this way and this is the spec of SUNOS. So does UMSDOS.

If the destination is an empty directory it will also be removed.

[linux/fs/umsdos/namei.c,1008]
#Specification: rename / new name exist / possible flaw
The code to handle the deletion of the target (file and directory) used to be in umsdos_rename_f, surrounded by proper directory locking.
This ensured that only one process could perform a rename (modification) operation in the source and destination directories. This also ensured the operation was "atomic".

This has been changed because it was causing a kernel stack overflow (the stack is only 4k in the kernel). To avoid this, the code doing the deletion of the target (if it exists) has been moved to an upper layer. umsdos_rename_f is tried once and, if it fails with EEXIST, the target is removed and umsdos_rename_f is done again. This makes the code cleaner and (not sure) solves a deadlock problem one tester was experiencing.

The point is to mention that possibly, the semantics of "rename" may be wrong. Anyone dare to check that :-) Be aware that IF it is wrong, to produce the problem you will need two processes trying to rename a file to the same target at the same time. Again, I am not sure it is a problem at all.

11.8 Data structure

[linux/fs/umsdos/inode.c,177]
#Specification: inode / umsdos info
The first time an inode is seen (inode->i_count == 1), the inode number of the EMD file which controls this inode is tagged to this inode. This allows operations such as notify_change to be handled.

11.9 Inode management

[linux/fs/umsdos/inode.c,242]
#Specification: Inode / post initialisation
To completely initialise an inode, we need access to the owner directory, so we can locate more info in the EMD file. This is not available the first time the inode is accessed, so we use a value in the inode to tell if it has been finally initialised.

At first, we tried testing i_count, but it was causing problems. It is possible that two or more processes use the newly accessed inode. While the first one blocks during the initialisation (probably while reading the EMD file), the others believe all is well because i_count > 1. They go bananas with a broken inode. See umsdos_lookup_patch and umsdos_patch_inode.
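The difference between the two tests can be shown with a tiny model. This is only a sketch: struct demo_inode and both functions are hypothetical, and the real driver keeps its marker inside the umsdos part of struct inode rather than in a separate field like this one.

```c
/* Hypothetical model of an inode seen by several processes. */
struct demo_inode {
    int i_count;      /* number of users currently holding the inode */
    int initialised;  /* set once the EMD information has been merged in */
};

/* The first attempt described above: "somebody else already has it,
 * so it must be ready". Wrong while the first user is still blocked
 * in the initialisation. */
int demo_ready_by_count(const struct demo_inode *ino)
{
    return ino->i_count > 1;
}

/* The approach retained: test a value that is only set at the very
 * end of the initialisation. */
int demo_ready_by_flag(const struct demo_inode *ino)
{
    return ino->initialised != 0;
}
```

For an inode with i_count == 2 whose first user is still reading the EMD file, the count test claims the inode is ready while the flag test correctly says it is not.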

12 Synchronisation problems

[linux/fs/umsdos/namei.c,238]
#Specification: create / file exist in DOS
Here is a situation: we are trying to create a file with UMSDOS. The file is unknown to UMSDOS but already exists in the DOS directory.

Here is what we are NOT doing: we could silently assume that everything is fine and allow the creation to succeed.

It is possible that not all files in the partition are meant to be visible from linux. By trying to create those files in some directory, a user might get access to them without proper permissions. Looks like a security hole to me. Of course, sharing a file system with DOS is some kind of security hole :-)

So ? We return EEXIST in this case. The same is true for directory creation.

[linux/fs/umsdos/namei.c,685]
#Specification: mkdir / Directory already exist in DOS
We do the same thing as for file creation. For all users, it is an error.

13 Convention and style

[linux/fs/umsdos/inode.c,366]
#Specification: function name / convention
A simple convention for function names has been used in the UMSDOS file system. First, all functions use the prefix umsdos_ to avoid name clashes with other parts of the kernel. Standard VFS entry points use the prefix UMSDOS (upper case), so it's easier to tell them apart.

[linux/fs/umsdos/namei.c,757]
#Specification: style / iput strategy
In the UMSDOS project, I am trying to apply a single programming style regarding inode management. Many entry points receive an inode to act on, and must do an iput() as soon as they are finished with the inode.

For simple cases, there is no problem. When you introduce error checking, you end up with many iput() calls scattered around the code.

The coding style I use all around is one where I am trying to provide independent flow logic (I don't know how to name this). With this style, code is easier to understand but you rapidly get iput() all around. Here is an example of what I am trying to avoid.

    if (a){
        ...
        if(b){
            ...
        }
        ...
        if (c){
            // Complex state. Was b true ?
            ...
        }
        ...
    }
    // Weird state
    if (d){
        // ...
    }
    // Was iput finally done ?
    return status;

Here is the style I am using. Still, sometimes I do the first when things are very simple (or very complicated :-( )

    if (a){
        if (b){
            ...
        }else if (c){
            // A single state gets here
        }
    }else if (d){
        ...
    }
    return status;

Again, while this helps clarify the code, I often get a lot of iput(), unlike the first style, where I can place a few "strategic" iput(). "Strategic" also means more difficult to place.

So here is the style I will be using from now on in this project. There is always an iput() at the end of a function (which has to do an iput()). One iput() per inode. There is also one iput() at the places where a successful operation is achieved. This iput() is often done by a sub-function (often from the msdos file system). So I get one too many iput() ? At the place where an iput() is done, the inode is simply nulled, disabling the last one.

    if (a){
        if (b){
            ...
        }else if (c){
            msdos_rmdir(dir,...);
            dir = NULL;
        }
    }else if (d){
        ...
    }
    iput (dir);
    return status;

Note that the umsdos_lockcreate() and umsdos_unlockcreate() function pair goes against this practice of "forgetting" the inode as soon as possible.

[linux/fs/umsdos/inode.c,28]
#Specification: convention / PRINTK Printk and printk
Here is the convention for the use of printk inside fs/umsdos:

    printk carries important messages (error or status).
    Printk is for debugging (it is a macro defined at the beginning of most source files).
    PRINTK is a nulled Printk macro.

This convention makes the source easier to read, and Printk easier to shut off.

14 Weakness and features

The UMSDOS file system is somewhat of a compromise. Here are the drawbacks.

-Space efficiency. The minimal allocation unit is generally 2k. Also, the MsDOS FAT fs does not support sparse files (files with gaps of unallocated blocks).
-General performance. UMSDOS runs piggy back on top of another FS. This means two directory structures to maintain.
-Maximum number of files is limited to 64k.

It should be a good fs for many purposes, especially if you have to coexist with DOS.

UMSDOS is trying to emulate the UNIX semantics for file systems. Here are the known differences and weaknesses.

[linux/fs/umsdos/namei.c,503]
#Specification: weakness / hard link
The strategy for hard links introduces a side effect that may or may not be acceptable. Here is the sequence:

    mkdir subdir1
    touch subdir1/file
    mkdir subdir2
    ln subdir1/file subdir2/file
    rm subdir1/file
    rmdir subdir1
    rmdir: subdir1: Directory not empty

This happens because there is an invisible file (--link) in subdir1 which is referenced by subdir2/file.

Any idea ?

[linux/fs/umsdos/namei.c,522]
#Specification: weakness / hard link / rename directory
Another weakness of hard links comes from the fact that they are based on hidden symbolic links. Here is an example.

    mkdir /subdir1
    touch /subdir1/file
    mkdir /subdir2
    ln /subdir1/file subdir2/file
    mv /subdir1 subdir3
    ls -l /subdir2/file

Since /subdir2/file is a hidden symbolic link to /subdir1/..hlinkNNN, accessing it will fail because /subdir1 does not exist anymore (it has been renamed).

[linux/fs/umsdos/namei.c,980]
#Specification: weakness / rename
There is a case where UMSDOS rename behaves differently from a normal UNIX file system. Renaming an open file across a directory boundary does not work. Renaming an open file within a directory does work, however.

The problem (not sure) is in the linux VFS msdos driver. I believe this is not a bug but a design feature, because an inode number represents some sort of directory address in the MSDOS directory structure. So moving the file into another directory does not preserve the inode number.

15 utilities

Very little has been done here. Not much is missing though. Look in the directory umsdos_progs.

15.1 The UMSDOS synchroniser

[umsdos_progs/util/umssync.c,1]
#Specification: utility / synchroniser
The UMSDOS synchroniser (umssync) makes sure that the EMD file is in sync with the MSDOS directory.
Files created during a DOS session should be added to the EMD. Files removed should be erased from the EMD.

The UMSDOS file system will operate normally even if the system is out of sync. However, files will be missing from directory searches, creating an annoying feeling.

There is no easy way for UMSDOS to achieve this kind of update transparently. Here are the reasons:

-This process takes some time for each directory. If there were some access time in MSDOS for directories, then, based on boot time, it would be possible to do it once per directory. It is not the case.
-When a file is discovered in MSDOS which does not exist in the EMD, we need some directives to properly map the file. At least the owner must be known.

A set of ioctls is available (wrapper interface in umsdos_progs/umsdosio.c) to allow independent manipulation of the EMD and the DOS directory.

A utility is provided. It should be run from /etc/rc. A man page (umssync.8) describes its options.

[umsdos_progs/util/umssync.c,425]
#Specification: umssync / default creation mode
Unless overridden with command line options, files and directories created by umssync will be owned by root, with mode 755 for directories and mode 644 for files.

[umsdos_progs/util/umssync.c,435]
#Specification: umssync / depth
Normally, umssync won't recurse into directories. Option -r allows for depth control. You may specify how deep you want umssync to work.

When recursing into directories, umssync will use the owner and group specified on the command line (see options -g and -u). If option -i+ is specified, the specs of the sub-directory itself may be used.

umssync won't follow symlinks. And it won't cross mount points.

[umsdos_progs/util/umssync.c,85]
#Specification: umssync / mangled name
If a DOS file is missing from the EMD, it is added. If the file has an extension whose first character is a member of the restricted set used for mangling, the operation won't be done. A message will be printed.
To synchronise it back into the EMD, the file must be renamed.

If one tries to create such a file with umsdos, it is automatically mangled, producing a different file name in DOS. This is always done, to avoid the following problem:

    Unix command       MsDOS file name created
    ============       =======================
    mkdir DIR          dir.{__
    mkdir dir.{__      dir.{_1
    ...
    mkdir dir.{_1      dir.{10

Now, suppose that dir.{__ does not exist in the EMD. dir.{_1 does exist in DOS. If we try to create it in Umsdos, this will create a mangled name. Mangling is based on the entry offset in the EMD.

So if, say, dir.{_1 exists in DOS but not in Umsdos, rename it to anything (dir.111) and synchronise it in the EMD with umssync.

[umsdos_progs/util/umssync.c,367]
#Specification: umssync / mount point
umssync won't cross mount points. It means you must specify each mount point separately.

[umsdos_progs/util/umssync.c,544]
#Specification: umssync / user mode
To execute umssync, the effective user id must be root. It is possible to configure umssync to run setuid root. In this case (when getuid() != geteuid()), umssync shows a special behavior: options -d -f -g -i -u are not available anymore. The inheriting mode is automatically activated. There is no way to deactivate it.

A user should be able to umssync their own directory. If a user applies umssync to a directory, all files uncovered will be given to the owner of the directory with restrictive permissions (600 for files, 700 for directories).

Another way would be to limit umssync operation to directories which belong to the user. Suggestions welcome.

15.2 Other

[umsdos_progs/util/udump.c,4]
#Specification: utilities / udump
udump displays the content of a --linux-.--- file (EMD file). Simply type:

    udump file

This utility was mainly used to debug the UMSDOS file system.

[umsdos_progs/util/udosctl.c,12]
#Specification: umsdos_progs / udosctl
The udosctl utility gives direct access to the UMSDOS ioctls on directories.

    udosctl command arg

Here are the commands:

ls: List the content of DOS directory arg. Bypasses the EMD file. Uses UMSDOS_READDIR_DOS.

create: Create the file arg in the EMD file. Does nothing on the DOS directory. Uses UMSDOS_CREAT_UMSDOS.

mkdir: Create the directory arg in the EMD file. Does nothing on the DOS directory. Uses UMSDOS_CREAT_UMSDOS.

rm: Remove the file arg in the DOS directory. Bypasses the EMD file. Uses UMSDOS_UNLINK_DOS.

rmdir: Remove the directory arg in the DOS directory. Bypasses the EMD file. Uses UMSDOS_RMDIR_DOS.

uls: List the content of the EMD and print the corresponding DOS mangled names. Uses UMSDOS_READDIR_EMD.

urm: Remove the file arg from the EMD file. Doesn't touch the DOS directory. Uses UMSDOS_UNLINK_UMSDOS.

urmdir: Remove the directory arg from the EMD file. Doesn't touch the DOS directory. Uses UMSDOS_UNLINK_UMSDOS.

version: Print the version of the UMSDOS driver running.

This program was done mostly for illustration of ioctl use and for testing.

16 Test cases

The umsdos_progs/tests directory holds two utilities. utstgen is a general test suite, "UMSDOS independent". It tests the general behavior of UMSDOS as a true UNIX-like file system.

utstspc is truly UMSDOS oriented. It (will) test the proper behavior of UMSDOS, especially when the EMD and the MSDOS directory are out of sync. Currently utstspc is not testing much!

16.1 utstgen

[umsdos_progs/tests/utstgen.c,7]
#Specification: umsdos / automated test / general
utstgen.c is a sequence of tests for the UMSDOS file system. These tests are not really specific to the UMSDOS file system. Still, you will find extensive testing of some things whose implementation is specific to the UMSDOS file system. There is a long section on hard links, which could hardly fail on a normal UNIX file system and were a nightmare to implement in UMSDOS.

[umsdos_progs/tests/gen/hlink.c,108]
#Specification: utstgen / hard link / cases / across directory boundary
The target of the link is not in the same directory as the new link.
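
As an illustration of this case, here is a minimal user-space sketch of such a check, written with plain POSIX calls. It is not the code of hlink.c: the function name check_hlink_across_dirs, the scratch directory layout and the pass criterion (both names report the same device and inode) are assumptions made for the example.

```c
#define _XOPEN_SOURCE 700  /* for mkdtemp() under strict C modes */
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical check, not the utstgen code: make a hard link whose target
 * lives in another directory. Returns 0 if both names end up designating
 * the same file, -1 on any failure.
 */
int check_hlink_across_dirs(const char *parent)
{
    char base[256], dir1[300], dir2[300], target[320], newlink[320];
    struct stat st_target, st_link;
    int fd, ret = -1;

    assert(parent != NULL);
    if (snprintf(base, sizeof base, "%s/hlinkXXXXXX", parent) >= (int)sizeof base)
        return -1;
    if (mkdtemp(base) == NULL)      /* private scratch directory */
        return -1;
    snprintf(dir1, sizeof dir1, "%s/subdir1", base);
    snprintf(dir2, sizeof dir2, "%s/subdir2", base);
    snprintf(target, sizeof target, "%s/file", dir1);
    snprintf(newlink, sizeof newlink, "%s/file", dir2);

    if (mkdir(dir1, 0755) == 0 && mkdir(dir2, 0755) == 0) {
        fd = open(target, O_CREAT | O_EXCL | O_WRONLY, 0644);
        if (fd >= 0) {
            close(fd);
            if (link(target, newlink) == 0
                && stat(target, &st_target) == 0
                && stat(newlink, &st_link) == 0
                && st_target.st_dev == st_link.st_dev
                && st_target.st_ino == st_link.st_ino)
                ret = 0;
        }
    }
    /* Best-effort cleanup; errors are ignored on purpose. */
    unlink(newlink);
    unlink(target);
    rmdir(dir2);
    rmdir(dir1);
    rmdir(base);
    return ret;
}
```

Calling check_hlink_across_dirs() with a directory of the mounted file system under test exercises the across-directory case; the function cleans up after itself whether the check passes or not.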

[umsdos_progs/tests/gen/hlink.c,121]
#Specification: utstgen / hard link / cases / target does not exist
Many hard links are attempted to a file which does not exist.

[umsdos_progs/tests/gen/hlink.c,127]
#Specification: utstgen / hard link / to a directory
A hard link can't be made to a directory.

[umsdos_progs/tests/gen/hlink.c,50]
#Specification: utstgen / hard links / case / link 2 link 2 link ...
hlink_simple tests a link made to a link made to a link and so on. On a normal UNIX file system, this test is not really an issue. Given the fact that a hard link on UMSDOS is a symlink to a hidden file, it makes sense to test at least these two cases:

    hard link to an existing file with no link
    hard link to an existing file with more than one link.

[umsdos_progs/tests/gen/multi.c,67]
#Specification: utstgen / multi task / basic test
A simple test is performed on a directory by many tasks. Only one task must succeed at a time. The others must fail with a specific error code. So we fork 10 times. This test hopes it is a sufficient test :-(

[umsdos_progs/tests/gen/file.c,176]
#Specification: utstgen / Rename test
Rename tests are done with files and directories. The following cases are tested:

-In the same directory
-In the root directory
-In two independent directories
-In a subdirectory and the parent

The same test is also done on an open file.

[umsdos_progs/tests/gen/file.c,199]
#Specification: utstgen / Rename test / open file
The rename test is done on an open file. We do the following sequence:

    Create a file
    Open it
    Rename it
    Write to the open handle
    Close it
    Open the file using the new name
    Read back the data and check it.

This test does not succeed if the file is renamed across directories. This sounds like a limitation of the linux msdos driver. I am not sure at this point. I hope it is not a critical feature of a Unix file system. Comments are welcome on this topic.
Comments with solutions also :-)

[umsdos_progs/tests/gen/rename.c,104]
#Specification: utstgen / rename / destination exist
The following rename tests are done. The source is always file1, and the destination file2 always exists. file2 is a file or a directory.

Here are the different cases:

    file1 is a file, file2 is a file.
    file1 is a file, file2 is a hard link to a file.
    file1 is a file, file2 exists and is an empty directory.
    file1 is a file, file2 exists and is a non-empty directory.
    file1 is a directory, file2 is a file.
    file1 is a directory, file2 is a hard link to a file.
    file1 is a directory, file2 exists and is an empty directory.
    file1 is a directory, file2 exists and is a non-empty directory.

This sequence is performed in the same directory and across directories. The following combinations are tested:

    path/dir1 -> path/dir1
    path/dir1 -> path/dir2
    path/dir1 -> path
    path -> path/dir1

[umsdos_progs/tests/gen/syml.c,49]
#Specification: utstgen / symbolic links / link 2 link 2 link ...
syml_simple tests the number of connected symlinks the kernel can handle (a symlink pointing to another, pointing to another ... and finally pointing to something).

[umsdos_progs/tests/gen/dir.c,142]
#Specification: utstgen / creating . and ..
A check is done that the special entries . and .. can be neither created nor removed.

[umsdos_progs/tests/gen/dir.c,160]
#Specification: utstgen / removing a busy directory
A check is done that a busy directory can't be removed. Here is the sequence we test. It must fail with EBUSY.

    mkdir dir
    cd dir
    rm ../dir

16.2 utstspc

[umsdos_progs/tests/utstspc.c,27]
#Specification: umsdos / automated test / specific
utstspc.c is a sequence of tests for the UMSDOS file system. These tests are specific to the UMSDOS file system.

[umsdos_progs/tests/utstspc.c,38]
#Specification: utstspc / default environnement
utstspc needs to start from a fresh partition (it reformats it). So we normally use it on a floppy. utstspc does the mount and umount of that floppy.
The default mount point is /mnt and the default drive (for mformat) is F:.

This value F: looks very odd. It comes from my own setup. Here is my definition for /etc/mtools:

    A /dev/fd0 12 0 0 0
    B /dev/fd1 12 0 0 0
    E /dev/fd0h1200 12 80 2 15    # A: 5 1/4
    F /dev/fd1H1440 12 80 2 18    # B: 3 1/2

Using /dev/fd0 and /dev/fd1 on A and B gives me flexibility for normal operation. I can read any type of floppy without much question. I can't do a mformat A: or B:. I always get an error. I guess /dev/fd0 and /dev/fd1 are flexible drivers, so they do not impose a format. The entries E and F replicate A and B, but this time use the specific devices. So I can do a "mformat f:" and expect to format a 1.44 3 1/2 floppy correctly.

Of course, if you know better, please tell me!

[umsdos_progs/tests/utstspc.c,129]
#Specification: utstspc / floppy only
To avoid disaster, utstspc will only work on a floppy. A test is done before everything else to ensure that the drive is indeed a floppy. It uses /etc/mtools to locate the proper device. It also assumes that a floppy device has a path starting with "/dev/fd". This is not foolproof!

[umsdos_progs/tests/utilspc.c,86]
#Specification: utstspc / what's needed
utstspc uses several other programs to achieve its tests. It has to reformat, mount, unmount etc... the floppy on which it is doing the test. utstspc assumes that the following utilities are available:

    /usr/bin/mformat
    /etc/mount or /bin/mount
    /etc/umount or /bin/umount

Also, it requires /etc/mtools.

[umsdos_progs/tests/spc/hlink.c,119]
#Specification: utstspc / hard link / cases / across directory boundary
The target of the link is not in the same directory as the new link.

[umsdos_progs/tests/spc/hlink.c,144]
#Specification: utstspc / hard link / cases / target does not exist
Many hard links are attempted to a file which does not exist.

[umsdos_progs/tests/spc/hlink.c,134]
#Specification: utstspc / hard link / in a DOS directory
A test is done to demonstrate that a hard link can't be created in a DOS directory.
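
The "floppy only" safeguard described earlier in this section (look the drive up in /etc/mtools, then require a path starting with "/dev/fd") can be sketched as follows. This is only an illustration, not the code of utstspc.c: the function names are hypothetical, the configuration is taken from a text buffer instead of the real file, and only the first two fields of each line (drive letter and device) are looked at.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Look up drive (e.g. 'F') in a buffer holding /etc/mtools-style lines and
 * copy the device path into dev. Returns 0 on success, -1 if not found.
 */
int mtools_device_for_drive(const char *text, char drive,
                            char *dev, size_t devsize)
{
    const char *line = text;

    assert(text != NULL && dev != NULL && devsize > 0);
    while (*line != '\0') {
        const char *eol = strchr(line, '\n');
        size_t len = eol ? (size_t)(eol - line) : strlen(line);
        char buf[256], name[16], path[128];

        if (len < sizeof buf) {          /* overlong lines are skipped */
            memcpy(buf, line, len);
            buf[len] = '\0';
            if (sscanf(buf, "%15s %127s", name, path) == 2
                && strlen(name) == 1
                && toupper((unsigned char)name[0]) == toupper((unsigned char)drive)
                && strlen(path) < devsize) {
                strcpy(dev, path);
                return 0;
            }
        }
        line += len;
        if (*line == '\n')
            line++;
    }
    return -1;
}

/* The (not foolproof) rule from the text: floppies live under /dev/fd*. */
int looks_like_floppy(const char *dev)
{
    return strncmp(dev, "/dev/fd", 7) == 0;
}
```

With the /etc/mtools definition shown earlier, looking up drive F yields /dev/fd1H1440, which passes looks_like_floppy(); a drive mapped to something like /dev/hda1 would be refused.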

[umsdos_progs/tests/spc/hlink.c,100]
#Specification: utstspc / hard link / subdirectory of a dos directory
We create two subdirectories in a DOS directory. We switch those to Umsdos mode (umssync). We set a lot of hard links between those two directories.

This test tries to demonstrate that the only thing that matters is that both subdirectories are umsdos directories. The parents don't have to be.

[umsdos_progs/tests/spc/hlink.c,150]
#Specification: utstspc / hard link / to a directory
A hard link can't be made to a directory.

[umsdos_progs/tests/spc/hlink.c,50]
#Specification: utstspc / hard links / case / link 2 link 2 link ...
hlink_simple tests a link made to a link made to a link and so on. On a normal UNIX file system, this test is not really an issue. Given the fact that a hard link on UMSDOS is a symlink to a hidden file, it makes sense to test at least these two cases:

    hard link to an existing file with no link
    hard link to an existing file with more than one link.

[umsdos_progs/tests/spc/read.c,109]
#Specification: utstspc / read write
We write files with a special pattern and read them back using different blocking schemes. This is to make sure the new read ahead support in the msdos fs is not screwing things up.

[umsdos_progs/tests/spc/read.c,152]
#Specification: utstspc / read write / text mode
The read/write test is also done in text conversion mode. This proves the read ahead of fs/msdos/file.c is working even in conv=text mode.

17 The MsDOS fs

Umsdos runs piggy back on top of the MsDOS fs of linux. Umsdos could be reused to play the same game on top of another fs (hpfs), but there are various small msdos specific quirks in it. And of course it calls a few msdos fs functions directly.

We will present some specs of the msdos fs here. It is not that much related to the design of Umsdos, but it might be one day.

[linux/fs/msdos/buffer.c,20]
#Specification: msdos / strategy / special device / dummy blocks
Many special devices (SCSI optical disks for one) use a larger hardware sector size. This allows for higher capacity.

Most of the time, the MsDOS file system that sits on such a device is totally unaligned. It logically uses a 512 byte sector size, with logical sectors starting in the middle of a hardware block. The bad news is that a hardware sector may hold data owned by two different files. This means that the hardware sector must be read, patched and written almost all the time.

Needless to say, it kills write performance on all OSes.

Internally the linux msdos fs uses 512 byte logical sectors. When accessing such a device, we allocate dummy buffer cache blocks, which we stuff with the information of a real one (1k large).

This strategy is used to hide this difference from the core of the msdos fs. The slowdown is not hidden though!

[linux/fs/msdos/file.c,60]
#Specification: msdos / special devices / mmap
Mmapping does work because a special mmap is provided in that case. Note that it is much less efficient than the generic_mmap normally used, since it allocates extra buffers. generic_mmap is used for normal devices (512 byte hardware sectors).

[linux/fs/msdos/file.c,79]
#Specification: msdos / special devices / swap file
Swap files can't work on special devices with a large sector size (1024 byte hard sectors). Those devices have a weird MsDOS filesystem layout. Generally a single hardware sector may contain 2 unrelated logical sectors. This means that there is no easy way to do a mapping between the disk sectors of a file and virtual memory. So a swap file is difficult (not available right now) on those devices. Of course, Ext2 does not have this problem.

[linux/fs/msdos/buffer.c,68]
#Specification: msdos / special device / writing
A write is always preceded by a read of the complete block (large hardware sector size). This defeats write performance.
It is possible to optimize this when writing large chunks, by making sure we are filling whole large blocks. Volunteers ?
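
The read, patch and write cycle described above can be sketched in user space. This is not the code of fs/msdos/buffer.c: struct fake_dev, the helper names and the 8 sector in-memory "device" are hypothetical, and the 1024 byte hardware sector is simply the size quoted in the text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HW_SECTOR  1024  /* assumed hardware sector size */
#define LOG_SECTOR 512   /* logical sector size used by the msdos fs */
#define HW_COUNT   8     /* size of the simulated device, in hardware sectors */

/* Hypothetical in-memory device that counts hardware accesses. */
struct fake_dev {
    unsigned char data[HW_COUNT][HW_SECTOR];
    unsigned long reads;
    unsigned long writes;
};

void hw_read(struct fake_dev *d, size_t blk, unsigned char *buf)
{
    assert(blk < HW_COUNT);
    memcpy(buf, d->data[blk], HW_SECTOR);
    d->reads++;
}

void hw_write(struct fake_dev *d, size_t blk, const unsigned char *buf)
{
    assert(blk < HW_COUNT);
    memcpy(d->data[blk], buf, HW_SECTOR);
    d->writes++;
}

/*
 * Write one 512 byte logical sector starting at byte offset byte_off.
 * Every hardware sector touched is read, patched and written back.
 */
void logical_write(struct fake_dev *d, size_t byte_off,
                   const unsigned char *src)
{
    size_t done = 0;

    while (done < LOG_SECTOR) {
        unsigned char buf[HW_SECTOR];
        size_t pos = byte_off + done;
        size_t blk = pos / HW_SECTOR;
        size_t in_blk = pos % HW_SECTOR;
        size_t chunk = HW_SECTOR - in_blk;

        if (chunk > LOG_SECTOR - done)
            chunk = LOG_SECTOR - done;
        hw_read(d, blk, buf);                     /* read  */
        memcpy(buf + in_blk, src + done, chunk);  /* patch */
        hw_write(d, blk, buf);                    /* write */
        done += chunk;
    }
}
```

A logical sector sitting in the second half of a hardware sector costs one read and one write; one that straddles two hardware sectors costs two of each. A write that covers a whole hardware sector would not need the hw_read() at all, which is presumably where the optimisation mentioned above would come from.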